{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 彩票上瘾脱瘾APP"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "许多人开始玩彩票是为了好玩，但对一些人来说，这种活动变成了一种习惯，最终演变成上瘾。和其他强迫性赌徒一样，彩票成瘾者很快就开始从他们的储蓄和贷款中消费，他们开始积累债务，最终陷入绝望的行为。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们希望开发一款专门的手机应用程序，帮助彩票成瘾者更好地估计自己中奖的几率，以此来帮助彩票成瘾者脱离瘾症。我们在这里创建应用程序的逻辑核心并计算概率。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "对于第一个版本的应用程序，我们将专注于49选6的大乐透彩票，并建立功能，使用户能够回答以下问题:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* 用一张彩票赢得大奖的可能性有多大?\n",
    "* 如果我们玩40张不同的彩票(或任何其他号码)，赢得大奖的概率是多少?\n",
    "* 一张彩票中至少有五个(或四个、三个或两个)中奖号码的概率是多少?\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们还将考虑来自加拿大（我们目前仅收集到加拿大的数据。希望大家可以丰富不同国家的历史数据）49选6彩票比赛的历史数据。数据集包含了3,665张图纸的[数据](https://www.kaggle.com/datascienceai/lottery-dataset)，时间跨度从1982年到2018年."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 核心函数"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在整個項目中，我們需要反復計算概率和組合。因此，我們將首先編寫兩個我們經常使用的函數：\n",
    "\n",
    "* 計算階乘的函數; 和\n",
    "* 計算組合的函數。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "#计算阶乘\n",
    "def factorial(n):\n",
    "    s=1\n",
    "    for i in range(n,0,-1):\n",
    "        s=s*i\n",
    "    return s"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 计算组合数目\n",
    "def combinations(n,k):\n",
    "    return factorial(n)/(factorial(k)*factorial(n-k))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 一张彩票中奖的概率"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在6/49抽獎中，從一組49個數字中抽取六個數字，範圍從1到49.如果他們的票上的六個數字與所繪製的六個數字相匹配，則玩家贏得大獎。\n",
    "\n",
    "我們需要注意以下細節：\n",
    "\n",
    "* 在應用程序內部，用戶輸入從1到49的六個不同的數字。\n",
    "* 能以友好的方式打印概率值 - 以一種沒有任何概率訓練的人能夠理解的方式。\n",
    "\n",
    "下面，我們編寫one_ticket_probability（）函數，該函數接收六個唯一數字的列表，並以易於理解的方式打印獲勝概率"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "你的号码14 2 33 4 26 6赢取大奖的概率是 0.0000072%,换句话说，你赢的概率是 13,983,816.0 分之一.\n"
     ]
    }
   ],
   "source": [
    "def one_ticket_probability(user_numbers):\n",
    "    n_combinations = combinations(49, 6)\n",
    "    probability_one_ticket = 1/n_combinations\n",
    "    percentage_form = probability_one_ticket * 100\n",
    "    \n",
    "    print('你的号码{}赢取大奖的概率是 {:.7f}%,换句话说，你赢的概率是 {:,} 分之一.'.format(user_numbers,percentage_form,n_combinations))\n",
    "    \n",
    "one_ticket_probability(\"14 2 33 4 26 6\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 历史数据检查"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在之前的屏幕上，我們編寫了一個函數，可以告訴用戶用一張票贏得大獎的概率是多少。然而，對於應用程序的第一個版本，用戶還應該能夠將他們的票證與歷史彩票數據（加拿大）進行比較，並確定他們現在是否曾經贏過。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(3665, 11)"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "\n",
    "lottery_canada = pd.read_csv('649.csv')\n",
    "lottery_canada.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PRODUCT</th>\n",
       "      <th>DRAW NUMBER</th>\n",
       "      <th>SEQUENCE NUMBER</th>\n",
       "      <th>DRAW DATE</th>\n",
       "      <th>NUMBER DRAWN 1</th>\n",
       "      <th>NUMBER DRAWN 2</th>\n",
       "      <th>NUMBER DRAWN 3</th>\n",
       "      <th>NUMBER DRAWN 4</th>\n",
       "      <th>NUMBER DRAWN 5</th>\n",
       "      <th>NUMBER DRAWN 6</th>\n",
       "      <th>BONUS NUMBER</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>649</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>6/12/1982</td>\n",
       "      <td>3</td>\n",
       "      <td>11</td>\n",
       "      <td>12</td>\n",
       "      <td>14</td>\n",
       "      <td>41</td>\n",
       "      <td>43</td>\n",
       "      <td>13</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>649</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>6/19/1982</td>\n",
       "      <td>8</td>\n",
       "      <td>33</td>\n",
       "      <td>36</td>\n",
       "      <td>37</td>\n",
       "      <td>39</td>\n",
       "      <td>41</td>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>649</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>6/26/1982</td>\n",
       "      <td>1</td>\n",
       "      <td>6</td>\n",
       "      <td>23</td>\n",
       "      <td>24</td>\n",
       "      <td>27</td>\n",
       "      <td>39</td>\n",
       "      <td>34</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   PRODUCT  DRAW NUMBER  SEQUENCE NUMBER  DRAW DATE  NUMBER DRAWN 1  \\\n",
       "0      649            1                0  6/12/1982               3   \n",
       "1      649            2                0  6/19/1982               8   \n",
       "2      649            3                0  6/26/1982               1   \n",
       "\n",
       "   NUMBER DRAWN 2  NUMBER DRAWN 3  NUMBER DRAWN 4  NUMBER DRAWN 5  \\\n",
       "0              11              12              14              41   \n",
       "1              33              36              37              39   \n",
       "2               6              23              24              27   \n",
       "\n",
       "   NUMBER DRAWN 6  BONUS NUMBER  \n",
       "0              43            13  \n",
       "1              41             9  \n",
       "2              39            34  "
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "lottery_canada.head(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 查验历史函数"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我們將編寫一個功能，使用戶能夠將他們的票證與加拿大的歷史彩票數據進行比較，並確定他們現在是否曾贏過。\n",
    "\n",
    "我們需要了解以下細節：\n",
    "\n",
    "* 在應用程序內部，用戶輸入從1到49的六個不同的數字。\n",
    "* 編寫一個打印的功能：\n",
    "  * 所選組合在加拿大數據集中出現的次數; 和\n",
    "  * 使用該組合在下一張圖中贏得大獎的概率。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0    {3, 41, 11, 12, 43, 14}\n",
       "1    {33, 36, 37, 39, 8, 41}\n",
       "2     {1, 6, 39, 23, 24, 27}\n",
       "3     {3, 9, 10, 43, 13, 20}\n",
       "4    {34, 5, 14, 47, 21, 31}\n",
       "dtype: object"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def extract_numbers(row):\n",
    "    row=row[4:10]\n",
    "    row=set(row.values)\n",
    "    return row\n",
    "\n",
    "winning_numbers = lottery_canada.apply(extract_numbers,axis=1)\n",
    "winning_numbers.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "下面，我們編寫check_historical_occurrence（）函數，該函數接收用戶編號和歷史編號，並打印有關下一個圖形中出現次數和獲勝概率的信息。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "def check_historical_occurrence(user_numbers, historical_numbers):\n",
    "    user_numbers_set=set(user_numbers)\n",
    "    check_occurrence = historical_numbers == user_numbers_set\n",
    "    n_occurrences = check_occurrence.sum()\n",
    "    \n",
    "    if n_occurrences == 0:\n",
    "        print('''改组合 {} 从未赢过。但这并不意味着它不会发生。你用该组合赢大奖的几率是\n",
    "0.0000072%.换句话说，你赢的几率是 13,983,816 分之一.'''.format(user_numbers, user_numbers))\n",
    "        \n",
    "    else:\n",
    "        print('''该组合 {} 在历史上赢过 {}次.但你用该组合赢的大奖的几率依旧是是\n",
    "0.0000072%.换句话说，你赢的几率是 13,983,816 分之一.'''.format(user_numbers, n_occurrences,\n",
    "                                                                            user_numbers))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "该组合 [33, 36, 37, 39, 8, 41] 在历史上赢过 1次.但你用该组合赢的大奖的几率依旧是是\n",
      "0.0000072%.换句话说，你赢的几率是 13,983,816 分之一.\n"
     ]
    }
   ],
   "source": [
    "test_input_3 = [33, 36, 37, 39, 8, 41]\n",
    "check_historical_occurrence(test_input_3, winning_numbers)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "改组合 [3, 2, 44, 22, 1, 44] 从未赢过。但这并不意味着它不会发生。你用该组合赢大奖的几率是\n",
      "0.0000072%.换句话说，你赢的几率是 13,983,816 分之一.\n"
     ]
    }
   ],
   "source": [
    "test_input_4 = [3, 2, 44, 22, 1, 44]\n",
    "check_historical_occurrence(test_input_4, winning_numbers)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 多张彩票中奖概率"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "彩票癮君子通常在一張圖紙上打一張以上的票，認為這可能會增加他們獲勝的機會。我們的目的是幫助他們更好地估計他們獲勝的機會 - 我們將編寫一個功能，允許用戶計算獲勝任意數量的不同門票的機會。\n",
    "* 用戶將輸入他們想要播放的不同票的數量（不輸入他們打算播放的特定組合）。\n",
    "* 我們的函數將看到1到13,983,816之間的整數（不同票證的最大數量）。\n",
    "* 該功能應根據所播放的不同門票的數量打印關於贏得大獎的概率的信息。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "#编写一个函数\n",
    "def multi_ticket_probability(n_tickets):\n",
    "    n_combinations = combinations(49, 6)\n",
    "    \n",
    "    probability = n_tickets / n_combinations\n",
    "    percentage_form = probability * 100\n",
    "    \n",
    "    if n_tickets == 1:\n",
    "        print('''你中大奖的几率是 {:.6f}%.换句话说，你赢的几率是 {:,} 分之一.'''.format(percentage_form, int(n_combinations)))\n",
    "    \n",
    "    else:\n",
    "        combinations_simplified = round(n_combinations / n_tickets)   \n",
    "        print('''你使用{:,}张彩票赢的几率是{:.6f}%.换句话说，你赢的几率是 {:,} 分之一.'''.format(n_tickets, percentage_form,\n",
    "                                                               combinations_simplified))\n",
    "                                  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "下面我们进行测试："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "你中大奖的几率是 0.000007%.换句话说，你赢的几率是 13,983,816 分之一.\n",
      "------------------------\n",
      "你使用10张彩票赢的几率是0.000072%.换句话说，你赢的几率是 1,398,382 分之一.\n",
      "------------------------\n",
      "你使用100张彩票赢的几率是0.000715%.换句话说，你赢的几率是 139,838 分之一.\n",
      "------------------------\n",
      "你使用10,000张彩票赢的几率是0.071511%.换句话说，你赢的几率是 1,398 分之一.\n",
      "------------------------\n",
      "你使用1,000,000张彩票赢的几率是7.151124%.换句话说，你赢的几率是 14 分之一.\n",
      "------------------------\n",
      "你使用6,991,908张彩票赢的几率是50.000000%.换句话说，你赢的几率是 2 分之一.\n",
      "------------------------\n",
      "你使用13,983,816张彩票赢的几率是100.000000%.换句话说，你赢的几率是 1 分之一.\n",
      "------------------------\n"
     ]
    }
   ],
   "source": [
    "test_inputs = [1, 10, 100, 10000, 1000000, 6991908, 13983816]\n",
    "\n",
    "for test_input in test_inputs:\n",
    "    multi_ticket_probability(test_input)\n",
    "    print('------------------------') # output delimiter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 较少号码中小奖的概率"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在大多數6/49彩票中，如果玩家的票與所抽取的六個數字中的兩個，三個，四個或五個相匹配，則會有較小的獎品。這意味著玩家可能有興趣找出有兩個，三個，四個或五個中獎號碼的概率 - 對於應用程序的第一個版本，用戶應該能夠找到這些概率。\n",
    "\n",
    "當我們編寫一個函數來計算這些概率時，我們需要注意這些細節：\n",
    "\n",
    "* 在應用程序內部，用戶輸入：\n",
    "  * 六個不同的數字，從1到49; 和\n",
    "  * 2到5之間的整數，表示預期的中獎號碼數\n",
    "* 我們的功能打印有關獲得一定數量的中獎號碼概率的信息\n",
    "\n",
    "為了計算概率，特定組合是無關緊要的，我們只需要2到5之間的整數表示預期的中獎號碼數。因此，我們將編寫一個名為probability_less_6（）的函數，該函數接受一個整數，並根據該整數的值打印有關獲勝機會的信息。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [],
   "source": [
    "def probability_less_6(n_winning_numbers):\n",
    "    n_combinations_ticket = combinations(6, n_winning_numbers)\n",
    "    n_combinations_remaining = combinations(49 - n_winning_numbers,\n",
    "                                           6 - n_winning_numbers)\n",
    "    successful_outcomes = n_combinations_ticket * n_combinations_remaining\n",
    "    n_combinations_total = combinations(49, 6)\n",
    "    \n",
    "    probability = successful_outcomes / n_combinations_total\n",
    "    probability_percentage = probability * 100\n",
    "    \n",
    "    combinations_simplified = round(n_combinations_total/successful_outcomes)\n",
    "    \n",
    "    print('''你{}个号码中奖的概率是{:.6f}%.换句话说，你有 {:,} 分之一几率赢.'''.format(n_winning_numbers, probability_percentage,\n",
    "                                                               int(combinations_simplified)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "现在我们测试："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "你2个号码中奖的概率是19.132653%.换句话说，你有 5 分之一几率赢.\n",
      "--------------------------\n",
      "你3个号码中奖的概率是2.171081%.换句话说，你有 46 分之一几率赢.\n",
      "--------------------------\n",
      "你4个号码中奖的概率是0.106194%.换句话说，你有 942 分之一几率赢.\n",
      "--------------------------\n",
      "你5个号码中奖的概率是0.001888%.换句话说，你有 52,969 分之一几率赢.\n",
      "--------------------------\n"
     ]
    }
   ],
   "source": [
    "for test_input in [2, 3, 4, 5]:\n",
    "    probability_less_6(test_input)\n",
    "    print('--------------------------') # output delimiter"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
